Ordering Pipelined Query Operators with Precedence Constraints
Abstract
We consider the problem of optimally arranging a collection of query operators into a pipelined execution plan in the presence of precedence constraints among the operators. The goal of our optimization is to maximize the rate at which input data items can be processed through the pipelined plan. We consider two different scenarios: one in which each operator is fixed to run on a separate machine, and the other in which all operators run on the same machine. Due to parallelism in the former scenario, the cost of a plan is given by the maximum (or bottleneck) cost incurred by any operator in the plan. In the latter scenario, the cost of a plan is given by the sum of the costs incurred by the operators in the plan. These two different cost metrics lead to fundamentally different optimization problems: Under the bottleneck cost metric, we give a general, polynomial-time greedy algorithm that always finds the optimal plan. However, under the sum cost metric, the problem is much harder: We show that it is unlikely that any polynomial-time algorithm can approximate the optimal plan to within a factor smaller than O(n^θ), where n is the number of operators, and θ is some positive constant. Finally, under the sum cost metric, for the special case when the selectivity of each operator lies in [ε, 1 − ε], we give an algorithm that produces a 2-approximation to the optimal plan but has running time exponential in 1/ε.
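The difference between the two cost metrics can be illustrated with a small sketch. It is not the paper's algorithm. It assumes a simplified model that the abstract does not spell out: plans are linear orderings, each operator has a per-item cost and a selectivity, and an operator only sees the fraction of input that survives the operators before it. The names `respects`, `plan_costs`, and `brute_force` are hypothetical, and the exhaustive search is only practical for a handful of operators.

```python
from itertools import permutations


def respects(order, precedence):
    """True if every (before, after) pair in `precedence` holds in `order`."""
    pos = {op: i for i, op in enumerate(order)}
    return all(pos[a] < pos[b] for a, b in precedence)


def plan_costs(order, cost, selectivity):
    """Return (bottleneck_cost, sum_cost) of a linear plan.

    Assumed model: the operator at position i receives a fraction of the
    input equal to the product of the selectivities of earlier operators,
    so its cost per input item is that fraction times its per-item cost.
    """
    rate = 1.0  # fraction of input items reaching the current operator
    per_operator = []
    for op in order:
        per_operator.append(rate * cost[op])
        rate *= selectivity[op]
    return max(per_operator), sum(per_operator)


def brute_force(ops, cost, selectivity, precedence, metric):
    """Exhaustively pick the best valid ordering (tiny inputs only)."""
    index = 0 if metric == "bottleneck" else 1
    valid = [p for p in permutations(ops) if respects(p, precedence)]
    return min(valid, key=lambda p: plan_costs(p, cost, selectivity)[index])


# Illustrative numbers, chosen by hand; "a" must run before "b".
cost = {"a": 4.0, "b": 1.0, "c": 2.0}
selectivity = {"a": 0.5, "b": 0.5, "c": 0.25}
precedence = [("a", "b")]

best_bottleneck = brute_force("abc", cost, selectivity, precedence, "bottleneck")
best_sum = brute_force("abc", cost, selectivity, precedence, "sum")
```

Under the bottleneck metric only the most heavily loaded operator matters, whereas under the sum metric every operator's load counts, which is why the same set of operators can call for different orderings in the two scenarios.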
Similar resources
A Tree-Decomposition Approach to Parallel Query Optimization
In this paper we present an approach for transforming a relational join tree into a detailed execution plan with resource allocation information, for execution on a parallel machine. Our approach starts by transforming a query tree, such as might be generated by a sequential optimizer, into an operator tree which is then partitioned into a forest of linear chains of pipelined operators. We pres...
A Foundation for the Replacement of Pipelined Physical Join Operators in Adaptive Query Processing
Adaptive query processors make decisions as to the most effective evaluation strategy for a query based on feedback received while the query is being evaluated. In essence, any of the decisions made by the optimizer (e.g., on operator order or on which operators to use) may be revisited in an adaptive query processor. This paper focuses on changes to physical operators (e.g., the specific join ...
The Pipelined Set Cover Problem
A classical problem in query optimization is to find the optimal ordering of a set of possibly correlated selections. We provide an abstraction of this problem as a generalization of set cover called pipelined set cover, where the sets are applied sequentially to the elements to be covered and the elements covered at each stage are discarded. We show that several natural heuristics for this NP-...
Continuous Query Optimization
In large federated and shared-nothing databases, resources can exhibit widely fluctuating characteristics. Assumptions made at the time a query is submitted will rarely hold throughout the duration of query processing. As a result, traditional static query optimization and execution techniques are ineffective in these environments. In this paper we introduce a query processing mechanism called an ...
SINGLE MACHINE DUE DATE ASSIGNMENT SCHEDULING PROBLEM WITH PRECEDENCE CONSTRAINTS AND CONTROLLABLE PROCESSING TIMES IN FUZZY ENVIRONMENT
In this paper, a due date assignment scheduling problem with precedence constraints and controllable processing times in an uncertain environment is investigated, in which the basic processing time of each job is assumed to be a symmetric trapezoidal fuzzy number, and a linear resource consumption function is used. The objective is to minimize the crisp possibilistic mean (or expected) value of...